xen.git
6 years ago.gitignore: Add configure output which we always delete and regenerate
Ian Jackson [Fri, 5 Oct 2018 17:05:48 +0000 (18:05 +0100)]
.gitignore: Add configure output which we always delete and regenerate

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agoautoconf: Provide libexec_libdir_suffix
Ian Jackson [Wed, 3 Oct 2018 15:25:58 +0000 (16:25 +0100)]
autoconf: Provide libexec_libdir_suffix

This is going to be used to put libfsimage.so into a path containing
the multiarch triplet.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agotools-libfsimage-prefix.diff
Ian Jackson [Fri, 5 Oct 2018 16:53:38 +0000 (17:53 +0100)]
tools-libfsimage-prefix.diff

Patch-Name: tools-libfsimage-prefix.diff

Gbp-Pq: Topic prefix-abiname
Gbp-Pq: Name tools-libfsimage-prefix.diff

6 years agotools-libfsimage-abiname.diff
Bastian Blank [Sat, 5 Jul 2014 09:46:47 +0000 (11:46 +0200)]
tools-libfsimage-abiname.diff

Patch-Name: tools-libfsimage-abiname.diff

Gbp-Pq: Topic prefix-abiname
Gbp-Pq: Name tools-libfsimage-abiname.diff

6 years agoDo not build the instruction emulator
Ian Jackson [Thu, 20 Sep 2018 17:10:14 +0000 (18:10 +0100)]
Do not build the instruction emulator

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agotools/tests/x86_emulator: Pass -no-pie -fno-pic to gcc on x86_32
Ian Jackson [Tue, 1 Nov 2016 16:20:27 +0000 (16:20 +0000)]
tools/tests/x86_emulator: Pass -no-pie -fno-pic to gcc on x86_32

The current build fails with GCC6 on Debian sid i386 (unstable):

 /tmp/ccqjaueF.s: Assembler messages:
 /tmp/ccqjaueF.s:3713: Error: missing or invalid displacement expression `vmovd_to_reg_len@GOT'

This is due to the combination of GCC6, and Debian's decision to
enable some hardening flags by default (to try to make runtime
addresses less predictable):
  https://wiki.debian.org/Hardening/PIEByDefaultTransition

This is of no benefit for the x86 instruction emulator test, which is
a rebuild of the emulator code for testing purposes only.  So pass
options to disable this.

These options will be no-ops if they are the same as the compiler
default.

On amd64, the -fno-pic breaks the build in a different way.  So do
this only on i386.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Gbp-Pq: Topic misc
Gbp-Pq: Name toolstestsx86_emulator-pass--no-pie--fno.patch

6 years agoRemove static solaris support from pygrub
Bastian Blank [Sat, 5 Jul 2014 09:47:29 +0000 (11:47 +0200)]
Remove static solaris support from pygrub

Patch-Name: tools-pygrub-remove-static-solaris-support

Gbp-Pq: Topic misc
Gbp-Pq: Name tools-pygrub-remove-static-solaris-support

6 years agotools-xenmon-install.diff
Bastian Blank [Sat, 5 Jul 2014 09:47:31 +0000 (11:47 +0200)]
tools-xenmon-install.diff

Patch-Name: tools-xenmon-install.diff

Gbp-Pq: Topic misc
Gbp-Pq: Name tools-xenmon-install.diff

6 years agoDo not ship COPYING into /usr/include
Bastian Blank [Sat, 5 Jul 2014 09:47:30 +0000 (11:47 +0200)]
Do not ship COPYING into /usr/include

This is not wanted in Debian.  COPYING ends up in
/usr/share/doc/xen-*copyright.

Patch-Name: tools-include-no-COPYING.diff

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agoconfig-prefix.diff
Bastian Blank [Sat, 5 Jul 2014 09:46:45 +0000 (11:46 +0200)]
config-prefix.diff

Patch-Name: config-prefix.diff

Gbp-Pq: Topic prefix-abiname
Gbp-Pq: Name config-prefix.diff

6 years agoversion
Bastian Blank [Sat, 5 Jul 2014 09:46:43 +0000 (11:46 +0200)]
version

Patch-Name: version.diff

Gbp-Pq: Topic misc
Gbp-Pq: Name version.diff

6 years agotools/kdd: mute spurious gcc warning
Marek Marczykowski-Górecki [Thu, 5 Apr 2018 01:50:55 +0000 (03:50 +0200)]
tools/kdd: mute spurious gcc warning

gcc-8 complains:

    kdd.c:698:13: error: 'memcpy' offset [-204, -717] is out of the bounds [0, 216] of object 'ctrl' with type 'kdd_ctrl' {aka 'union <anonymous>'} [-Werror=array-bounds]
                 memcpy(buf, ((uint8_t *)&ctrl.c32) + offset, len);
                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    kdd.c: In function 'kdd_select_callback':
    kdd.c:642:14: note: 'ctrl' declared here
         kdd_ctrl ctrl;
                  ^~~~

But this is impossible - 'offset' is unsigned and correctly validated
few lines before.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Release-Acked-by: Juergen Gross <jgross@suse.com>
(cherry picked from commit 437e00fea04becc91c1b6bc1c0baa636b067a5cc)

6 years agolibxl/arm: Fix build on arm64 + acpi w/ gcc 8.2
Christopher Clark [Thu, 16 Aug 2018 20:22:41 +0000 (13:22 -0700)]
libxl/arm: Fix build on arm64 + acpi w/ gcc 8.2

Add zero-padding to #defined ACPI table strings that are copied.
Provides sufficient characters to satisfy the length required to
fully populate the destination and prevent array-bounds warnings.
Add BUILD_BUG_ON sizeof checks for compile-time length checking.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit b8f33431f3dd23fb43a879f4bdb4283fdc9465ad)

6 years agotools: Move ARRAY_SIZE() into xen-tools/libs.h
Andrew Cooper [Wed, 4 Jul 2018 13:32:31 +0000 (14:32 +0100)]
tools: Move ARRAY_SIZE() into xen-tools/libs.h

xen-tools/libs.h currently contains a shared BUILD_BUG_ON() implementation and
is used by some tools.  Extend this to include ARRAY_SIZE and clean up all the
opencoding.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit e1b7eb92d3ec6ce3ca68cffb36a148eb59f59613)

6 years agoxenpmd: make 32 bit gcc 8.1 non-debug build work
Wei Liu [Thu, 26 Jul 2018 14:58:54 +0000 (15:58 +0100)]
xenpmd: make 32 bit gcc 8.1 non-debug build work

32 bit gcc 8.1 non-debug build yields:

xenpmd.c:354:23: error: '%02x' directive output may be truncated writing between 2 and 8 bytes into a region of size 3 [-Werror=format-truncation=]
     snprintf(val, 3, "%02x",
                       ^~~~
xenpmd.c:354:22: note: directive argument in the range [40, 2147483778]
     snprintf(val, 3, "%02x",
                      ^~~~~~
xenpmd.c:354:5: note: 'snprintf' output between 3 and 9 bytes into a destination of size 3
     snprintf(val, 3, "%02x",
     ^~~~~~~~~~~~~~~~~~~~~~~~
              (unsigned int)(9*4 +
              ~~~~~~~~~~~~~~~~~~~~
                             strlen(info->model_number) +
                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                             strlen(info->serial_number) +
                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                             strlen(info->battery_type) +
                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                             strlen(info->oem_info) + 4));
                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All info->* used in calculation are 32 bytes long, and the parsing
code makes sure they are null-terminated, so the end result of the
expression won't exceed 255, which should be able to be fit into 3
bytes in hexadecimal format.

Add an assertion to make gcc happy.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit e75c9dc85fdeeeda0b98d8cd8d784e0508c3ffb8)

6 years agoDelete configure output
Ian Jackson [Wed, 19 Sep 2018 15:53:22 +0000 (16:53 +0100)]
Delete configure output

These autogenerated files are not useful in Debian; dh_autoreconf will
regenerate them.

If this patch does not apply when rebasing, you can simply delete the
files again.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agoDelete config.sub and config.guess
Ian Jackson [Wed, 19 Sep 2018 15:45:49 +0000 (16:45 +0100)]
Delete config.sub and config.guess

dh_autoreconf will provide these back.

If this patch does not apply when rebasing, you can simply delete the
files again.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agotools-xenstore-compatibility.diff
Bastian Blank [Sat, 5 Jul 2014 09:47:36 +0000 (11:47 +0200)]
tools-xenstore-compatibility.diff

Patch-Name: tools-xenstore-compatibility.diff

Gbp-Pq: Topic xenstore
Gbp-Pq: Name tools-xenstore-compatibility.diff

6 years agotools-fake-xs-restrict
Debian Xen Team [Fri, 24 Aug 2018 17:45:17 +0000 (18:45 +0100)]
tools-fake-xs-restrict

Gbp-Pq: Topic xenstore
Gbp-Pq: Name tools-fake-xs-restrict.patch

6 years agotools/debugger/kdd: Install as `xen-kdd', not just `kdd'
Ian Jackson [Fri, 28 Sep 2018 14:30:54 +0000 (15:30 +0100)]
tools/debugger/kdd: Install as `xen-kdd', not just `kdd'

`kdd' is an unfortunate namespace landgrab.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agoxenmon: Install as xenmon, not xenmon.py
Ian Jackson [Fri, 28 Sep 2018 14:27:21 +0000 (15:27 +0100)]
xenmon: Install as xenmon, not xenmon.py

Adding the implementation language as a suffix to a program name is
poor practice.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agopygrub fsimage.so: Honour LDFLAGS when building
Ian Jackson [Thu, 4 Oct 2018 11:32:00 +0000 (12:32 +0100)]
pygrub fsimage.so: Honour LDFLAGS when building

This seems to have been simply omitted.  Obviously this is needed when
building and not just when installing.  Passing only when installing
is ineffective.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agolibfsimage: Honour general LDFLAGS
Ian Jackson [Thu, 4 Oct 2018 11:31:25 +0000 (12:31 +0100)]
libfsimage: Honour general LDFLAGS

Do not reset LDFLAGS to empty.  Instead, append the fsimage-special
LDFLAGS.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agogdbsx: Honour LDFLAGS when linking
Ian Jackson [Thu, 4 Oct 2018 11:30:37 +0000 (12:30 +0100)]
gdbsx: Honour LDFLAGS when linking

This command does the link, so it needs LDFLAGS.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agotools/xenstat: Fix shared library version
Bastian Blank [Sat, 5 Jul 2014 09:46:50 +0000 (11:46 +0200)]
tools/xenstat: Fix shared library version

libxenstat does not have a stable ABI.  Set its version to the current
Xen release version.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agodocs/man/xen-pv-channel.pod.7: Remove a spurious blank line
Ian Jackson [Wed, 3 Oct 2018 17:43:55 +0000 (18:43 +0100)]
docs/man/xen-pv-channel.pod.7: Remove a spurious blank line

No functional change.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agodocs/man: Provide properly-formatted NAME sections
Ian Jackson [Wed, 3 Oct 2018 17:42:42 +0000 (18:42 +0100)]
docs/man: Provide properly-formatted NAME sections

A manpage `foo.7.pod' must start with

  =head NAME

  foo - some summary of what foo is or what this manpage is

because otherwise manpage catalogue systems cannot generate a proper
`whatis' entry.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agoINSTALL: Mention kconfig
Ian Jackson [Fri, 21 Sep 2018 14:40:19 +0000 (15:40 +0100)]
INSTALL: Mention kconfig

Firstly, add a reference to the documentation for the kconfig system.

Secondly, warn the user about the XEN_CONFIG_EXPERT problem.

CC: Doug Goldstein <cardoe@cardoe.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agotools/Rules.mk: Honour PREPEND_LDFLAGS_XEN_TOOLS
Ian Jackson [Fri, 5 Oct 2018 16:52:54 +0000 (17:52 +0100)]
tools/Rules.mk: Honour PREPEND_LDFLAGS_XEN_TOOLS

This allows the caller to provide some LDFLAGS to the Xen build
system.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agotools/xentop : replace use of deprecated vwprintw
Christopher Clark [Wed, 18 Jul 2018 22:22:17 +0000 (15:22 -0700)]
tools/xentop : replace use of deprecated vwprintw

gcc-8.1 complains:

| xentop.c: In function 'print':
| xentop.c:304:4: error: 'vwprintw' is deprecated [-Werror=deprecated-declarations]
|     vwprintw(stdscr, (curses_str_t)fmt, args);
|     ^~~~~~~~

vw_printw (note the underscore) is a non-deprecated alternative.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Gbp-Pq: Topic misc
Gbp-Pq: Name tools-xentop-replace-use-of-deprecated-vwprintw.patch

6 years agoVarious: Fix typo `mappping'
Ian Jackson [Wed, 3 Oct 2018 18:00:22 +0000 (19:00 +0100)]
Various: Fix typo `mappping'

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agoVarious: Fix typo `infomation'
Ian Jackson [Wed, 3 Oct 2018 17:59:18 +0000 (18:59 +0100)]
Various: Fix typo `infomation'

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agotools/python/xen/lowlevel: Fix typo `sucess'
Ian Jackson [Wed, 3 Oct 2018 17:57:13 +0000 (18:57 +0100)]
tools/python/xen/lowlevel: Fix typo `sucess'

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agoVarious: Fix typo `reseting'
Ian Jackson [Wed, 3 Oct 2018 17:56:39 +0000 (18:56 +0100)]
Various: Fix typo `reseting'

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agoVarious: Fix typo `occured'
Ian Jackson [Wed, 3 Oct 2018 17:55:36 +0000 (18:55 +0100)]
Various: Fix typo `occured'

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agoVarious: Fix typos `unkown', `retreive' (detected by lintian)
Ian Jackson [Wed, 3 Oct 2018 17:51:50 +0000 (18:51 +0100)]
Various: Fix typos `unkown', `retreive' (detected by lintian)

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agotools/xentrace/xenalyze: Fix typos detected by lintian
Ian Jackson [Wed, 3 Oct 2018 17:46:47 +0000 (18:46 +0100)]
tools/xentrace/xenalyze: Fix typos detected by lintian

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agodocs/man: Fix two typos detected by the Debian lintian tool
Ian Jackson [Wed, 3 Oct 2018 17:44:18 +0000 (18:44 +0100)]
docs/man: Fix two typos detected by the Debian lintian tool

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
6 years agoUpdate changelog for new upstream 4.11.1+92-g6c33308a8d
Hans van Kranenburg [Tue, 18 Jun 2019 07:50:19 +0000 (09:50 +0200)]
Update changelog for new upstream 4.11.1+92-g6c33308a8d

[git-debrebase changelog: new upstream 4.11.1+92-g6c33308a8d]

6 years agoUpdate to upstream 4.11.1+92-g6c33308a8d
Hans van Kranenburg [Tue, 18 Jun 2019 07:50:19 +0000 (09:50 +0200)]
Update to upstream 4.11.1+92-g6c33308a8d

[git-debrebase anchor: new upstream 4.11.1+92-g6c33308a8d, merge]

6 years agod/changelog: finalise 4.11.1+58-g3b062f5040-1 (fix spaces)
Hans van Kranenburg [Tue, 14 May 2019 16:56:57 +0000 (18:56 +0200)]
d/changelog: finalise 4.11.1+58-g3b062f5040-1 (fix spaces)

Bleh... sometimes it's too late when you see something wrong, so an
extra commit. :) This is better.

6 years agod/changelog: finalise 4.11.1+58-g3b062f5040-1
Hans van Kranenburg [Mon, 13 May 2019 19:16:09 +0000 (21:16 +0200)]
d/changelog: finalise 4.11.1+58-g3b062f5040-1

6 years agoxen/arm: mm: Set-up page permission for Xen mappings earlier on
Julien Grall [Thu, 29 Nov 2018 11:37:43 +0000 (11:37 +0000)]
xen/arm: mm: Set-up page permission for Xen mappings earlier on

Xen mapping is first create using a 2MB page and then shatterred in 4KB
page for fine-graine permission. However, it is not safe to break-down
superpage page without going to an intermediate step invalidating
the entry.

As we are changing Xen mappings, we cannot go through the intermediate
step. The only solution is to create Xen mapping using 4KB entries
directly. As the Xen should always access the mappings according with
the runtime permission, it is then possible to set-up the permissions
while create the mapping.

We are still playing with the fire as there are still some
break-before-make issue in setup_pagetables (i.e switch between 2 sets of
page-tables). But it should slightly be better than the current state.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reported-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Reported-by: Jan-Peter Larsson <Jan-Peter.Larsson@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Matthew Daley <mattd@bugfuzz.com>
(cherry picked from commit 00c96d77422a4b84247bec5dadf434363d312cac)

6 years agolibacpi: report PCI slots as enabled only for hotpluggable devices
Igor Druzhinin [Thu, 6 Jun 2019 12:11:24 +0000 (14:11 +0200)]
libacpi: report PCI slots as enabled only for hotpluggable devices

DSDT for qemu-xen lacks _STA method of PCI slot object. If _STA method
doesn't exist then the slot is assumed to be always present and active
which in conjunction with _EJ0 method makes every device ejectable for
an OS even if it's not the case.

qemu-kvm is able to dynamically add _EJ0 method only to those slots
that either have hotpluggable devices or free for PCI passthrough.
As Xen lacks this capability we cannot use their way.

qemu-xen-traditional DSDT has _STA method which only reports that
the slot is present if there is a PCI devices hotplugged there.
This is done through querying of its PCI hotplug controller.
qemu-xen has similar capability that reports if device is "hotpluggable
or absent" which we can use to achieve the same result.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 6761965243b113230bed900d6105be05b28f5cea
master date: 2019-05-24 10:30:21 +0200

6 years agox86/IO-APIC: fix build with gcc9
Jan Beulich [Thu, 6 Jun 2019 12:11:09 +0000 (14:11 +0200)]
x86/IO-APIC: fix build with gcc9

There are a number of pointless __packed attributes which cause gcc 9 to
legitimately warn:

utils.c: In function 'vtd_dump_iommu_info':
utils.c:287:33: error: converting a packed 'struct IO_APIC_route_entry' pointer (alignment 1) to a 'struct IO_APIC_route_remap_entry' pointer (alignment 8) may result in an unaligned pointer value [-Werror=address-of-packed-member]
  287 |                 remap = (struct IO_APIC_route_remap_entry *) &rte;
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~

intremap.c: In function 'ioapic_rte_to_remap_entry':
intremap.c:343:25: error: converting a packed 'struct IO_APIC_route_entry' pointer (alignment 1) to a 'struct IO_APIC_route_remap_entry' pointer (alignment 8) may result in an unaligned pointer value [-Werror=address-of-packed-member]
  343 |     remap_rte = (struct IO_APIC_route_remap_entry *) old_rte;
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~

Simply drop these attributes. Take the liberty and also re-format the
structure definitions at the same time.

Reported-by: Charles Arnold <carnold@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: ca9310b24e6205de5387e5982ccd42c35caf89d4
master date: 2019-05-24 10:19:59 +0200

6 years agox86emul: add support for missing {,V}PMADDWD insns
Jan Beulich [Thu, 6 Jun 2019 12:10:46 +0000 (14:10 +0200)]
x86emul: add support for missing {,V}PMADDWD insns

Their pre-AVX512 incarnations have clearly been overlooked during much
earlier work. Their memory access pattern is entirely standard, so no
specific tests get added to the harness.

Reported-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 1a48bdd599b268a2d9b7d0c45f1fd40c4892186e
master date: 2019-05-16 13:43:17 +0200

6 years agox86/IRQ: avoid UB (or worse) in trace_irq_mask()
Jan Beulich [Thu, 6 Jun 2019 12:09:56 +0000 (14:09 +0200)]
x86/IRQ: avoid UB (or worse) in trace_irq_mask()

Dynamically allocated CPU mask objects may be smaller than cpumask_t, so
copying has to be restricted to the actual allocation size. This is
particulary important since the function doesn't bail early when tracing
is not active, so even production builds would be affected by potential
misbehavior here.

Take the opportunity and also
- use initializers instead of assignment + memset(),
- constify the cpumask_t input pointer,
- u32 -> uint32_t.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
master commit: 6fafb8befa99620a2d7323b9eca5c387bad1f59f
master date: 2019-05-13 16:41:03 +0200

6 years agox86/boot: Fix latent memory corruption with early_boot_opts_t
Andrew Cooper [Thu, 6 Jun 2019 12:09:37 +0000 (14:09 +0200)]
x86/boot: Fix latent memory corruption with early_boot_opts_t

c/s ebb26b509f "xen/x86: make VGA support selectable" added an #ifdef
CONFIG_VIDEO into the middle the backing space for early_boot_opts_t,
but didn't adjust the structure definition in cmdline.c

This only functions correctly because the affected fields are at the end
of the structure, and cmdline.c doesn't write to them in this case.

To retain the slimming effect of compiling out CONFIG_VIDEO, adjust
cmdline.c with enough #ifdef-ary to make C's idea of the structure match
the declaration in asm.  This requires adding __maybe_unused annotations
to two helper functions.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 30596213617fcf4dd7b71d244e16c8fc0acf456b
master date: 2019-05-13 10:35:38 +0100

6 years agox86/svm: Fix handling of ICEBP intercepts
Andrew Cooper [Thu, 6 Jun 2019 12:09:20 +0000 (14:09 +0200)]
x86/svm: Fix handling of ICEBP intercepts

c/s 9338a37d "x86/svm: implement debug events" added support for introspecting
ICEBP debug exceptions, but didn't account for the fact that
svm_get_insn_len() (previously __get_instruction_length) can fail and may
already have raised #GP with the guest.

If svm_get_insn_len() fails, return back to guest context rather than
continuing and mistaking a trap-style VMExit for a fault-style one.

Spotted by Coverity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Brian Woods <brian.woods@amd.com>
master commit: 1495b4ff9b4af2b9c0f12cdb6491082cecf34f86
master date: 2019-05-13 10:35:37 +0100

6 years agodrivers/video: drop framebuffer size constraints
Marek Marczykowski-Górecki [Thu, 6 Jun 2019 12:08:29 +0000 (14:08 +0200)]
drivers/video: drop framebuffer size constraints

The limit 1900x1200 do not match real world devices (1900 looks like a
typo, should be 1920). But in practice the limits are arbitrary and do
not serve any real purpose. As discussed in "Increase framebuffer size
to todays standards" thread, drop them completely.

This fixes graphic console on device with 3840x2160 native resolution.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
drivers/video: drop unused limits

MAX_BPP, MAX_FONT_W, MAX_FONT_H are not used in the code at all.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 19600eb75aa9b1df3e4b0a4e55a5d08b957e1fd9
master date: 2019-05-13 10:13:24 +0200
master commit: 343459e34a6d32ba44a21f8b8fe4c1f69b1714c2
master date: 2019-05-13 10:12:56 +0200

6 years agobitmap: fix bitmap_fill with zero-sized bitmap
Marek Marczykowski-Górecki [Thu, 6 Jun 2019 12:08:10 +0000 (14:08 +0200)]
bitmap: fix bitmap_fill with zero-sized bitmap

When bitmap_fill(..., 0) is called, do not try to write anything. Before
this patch, it tried to write almost LONG_MAX, surely overwriting
something.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 93df28be2d4f620caf18109222d046355ac56327
master date: 2019-05-13 10:12:00 +0200

6 years agox86/vmx: correctly gather gs_shadow value for current vCPU
Tamas K Lengyel [Thu, 6 Jun 2019 12:07:54 +0000 (14:07 +0200)]
x86/vmx: correctly gather gs_shadow value for current vCPU

Currently the gs_shadow value is only cached when the vCPU is being scheduled
out by Xen. Reporting this (usually) stale value through vm_event is incorrect,
since it doesn't represent the actual state of the vCPU at the time the event
was recorded. This prevents vm_event subscribers from correctly finding kernel
structures in the guest when it is trapped while in ring3.

Refresh shadow_gs value when the context being saved is for the current vCPU.

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
master commit: f69fc1c2f36e8a74ba54c9c8fa5c904ea1ad319e
master date: 2019-05-13 09:55:59 +0200

6 years agox86/mtrr: recalculate P2M type for domains with iocaps
Igor Druzhinin [Thu, 6 Jun 2019 12:07:06 +0000 (14:07 +0200)]
x86/mtrr: recalculate P2M type for domains with iocaps

This change reflects the logic in epte_get_entry_emt() and allows
changes in guest MTTRs to be reflected in EPT for domains having
direct access to certain hardware memory regions but without IOMMU
context assigned (e.g. XenGT).

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: f3d880bf2be92534c5bacf11de2f561cbad550fb
master date: 2019-05-13 09:54:45 +0200

6 years agoAMD/IOMMU: disable previously enabled IOMMUs upon init failure
Jan Beulich [Thu, 6 Jun 2019 12:06:49 +0000 (14:06 +0200)]
AMD/IOMMU: disable previously enabled IOMMUs upon init failure

If any IOMMUs were successfully initialized before encountering failure,
the successfully enabled ones should be disabled again before cleaning
up their resources.

Move disable_iommu() next to enable_iommu() to avoid a forward
declaration, and take the opportunity to remove stray blank lines ahead
of both functions' final closing braces.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Brian Woods <brian.woods@amd.com>
master commit: 87a3347d476443c66c79953d77d6aef1d2bb3bbd
master date: 2019-05-13 09:52:43 +0200

6 years agotrace: fix build with gcc9
Jan Beulich [Thu, 6 Jun 2019 12:06:29 +0000 (14:06 +0200)]
trace: fix build with gcc9

While I've not observed this myself, gcc 9 (imo validly) reportedly may
complain

trace.c: In function '__trace_hypercall':
trace.c:826:19: error: taking address of packed member of 'struct <anonymous>' may result in an unaligned pointer value [-Werror=address-of-packed-member]
  826 |     uint32_t *a = d.args;

and the fix is rather simple - remove the __packed attribute. Introduce
a BUILD_BUG_ON() as replacement, for the unlikely case that Xen might
get ported to an architecture where array alignment higher that that of
its elements.

Reported-by: Martin Liška <martin.liska@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
master commit: 3fd3b266d4198c06e8e421ca515d9ba09ccd5155
master date: 2019-05-13 09:51:23 +0200

6 years agoxen/sched: fix csched2_deinit_pdata()
Juergen Gross [Mon, 27 May 2019 13:55:20 +0000 (15:55 +0200)]
xen/sched: fix csched2_deinit_pdata()

Commit 753ba43d6d16e688 ("xen/sched: fix credit2 smt idle handling")
introduced a regression when switching cpus between cpupools.

When assigning a cpu to a cpupool with credit2 being the default
scheduler csched2_deinit_pdata() is called for the credit2 private data
after the new scheduler's private data has been hooked to the per-cpu
scheduler data. Unfortunately csched2_deinit_pdata() will cycle through
all per-cpu scheduler areas it knows of for removing the cpu from the
respective sibling masks including the area of the just moved cpu. This
will (depending on the new scheduler) either clobber the data of the
new scheduler or in case of sched_rt lead to a crash.

Avoid that by removing the cpu from the list of active cpus in credit2
data first.

The opposite problem is occurring when removing a cpu from a cpupool:
init_pdata() of credit2 will access the per-cpu data of the old
scheduler.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
master commit: ffd3367ed682b6ac6f57fcb151921054dd4cce7e
master date: 2019-05-17 15:41:17 +0200

6 years agooxenstored: Don't re-open a xenctrl handle for every domain introduction
Andrew Cooper [Wed, 3 Oct 2018 09:32:54 +0000 (10:32 +0100)]
oxenstored: Don't re-open a xenctrl handle for every domain introduction

Currently, an xc handle is opened in main() which is used for cleanup
activities, and a new xc handle is temporarily opened every time a domain is
introduced.  This is inefficient, and amongst other things, requires full root
privileges for the lifetime of oxenstored.

All code using the Xenctrl handle is in domains.ml, so initialise xc as a
global (now happens just before main() is called) and drop it as a parameter
from Domains.create and Domains.cleanup.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
(cherry picked from commit 129025fe30934c6a04bbd9c05ade479d34ce4985)

6 years agoxl: handle PVH type in apply_global_affinity_masks again
Wei Liu [Fri, 12 Apr 2019 10:03:25 +0000 (11:03 +0100)]
xl: handle PVH type in apply_global_affinity_masks again

A call site in create_domain can call it with PVH type. That site was
missed during the review of 48dab9767.

Reinstate PVH type in the switch.

Reported-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit 860d6e158dbb581c3aabc6a20ae8d83b325bffd8)
(cherry picked from commit b4f291b0ca914454cbac9fa5580bb35f8ab04eee)

6 years agotools/libxc: Fix issues with libxc and Xen having different featureset lengths
Andrew Cooper [Thu, 29 Nov 2018 18:10:38 +0000 (18:10 +0000)]
tools/libxc: Fix issues with libxc and Xen having different featureset lengths

In almost all cases, Xen and libxc will agree on the featureset length,
because they are built from the same source.

However, there are circumstances (e.g. security hotfixes) where the featureset
gets longer and dom0 will, after installing updates, be running with an old
Xen but new libxc.  Despite writing the code with this scenario in mind, there
were some bugs.

First, xen-cpuid's get_featureset() erroneously allocates a buffer based on
Xen's featureset length, but records libxc's length, which may be longer.

In this situation, the hypercall bounce buffer code reads/writes the recorded
length, which is beyond the end of the allocated object, and a later free()
encounters corrupt heap metadata.  Fix this by recording the same length that
we allocate.

Secondly, get_cpuid_domain_info() has a related bug when the passed-in
featureset is a different length to libxc's.

A large amount of the libxc cpuid functionality depends on info->featureset
being as long as expected, and it is allocated appropriately.  However, in the
case that a shorter external featureset is passed in, the logic to check for
trailing nonzero bits may read off the end of it.  Rework the logic to use the
correct upper bound.

In addition, leave a comment next to the fields in struct cpuid_domain_info
explaining the relationship between the various lengths, and how to cope with
different lengths.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit c393b64dcee6684da25257b033148740cb6d7ff0)

6 years agotools/xl: use libxl_domain_info to get domain type for vcpu-pin
Igor Druzhinin [Tue, 9 Apr 2019 12:01:58 +0000 (13:01 +0100)]
tools/xl: use libxl_domain_info to get domain type for vcpu-pin

Parsing the config seems to be an overkill for this particular task
and the config might simply be absent. Type returned from libxl_domain_info
should be either LIBXL_DOMAIN_TYPE_HVM or LIBXL_DOMAIN_TYPE_PV but in
that context distinction between PVH and HVM should be irrelevant.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit 48dab9767d2eb173495707cb1fd8ceaf73604ac1)
(cherry picked from commit c59579d8319b776ae6243da1999737e2b4737710)

6 years agotools/libxl: correct vcpu affinity output with sparse physical cpu map
Juergen Gross [Fri, 31 Aug 2018 15:22:04 +0000 (17:22 +0200)]
tools/libxl: correct vcpu affinity output with sparse physical cpu map

With not all physical cpus online (e.g. with smt=0) the output of hte
vcpu affinities is wrong, as the affinity bitmaps are capped after
nr_cpus bits, instead of using max_cpu_id.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit 2ec5339ec9218fbf1583fa85b74d1d2f15f1b3b8)

6 years agotools/ocaml: Dup2 /dev/null to stdin in daemonize()
Christian Lindig [Wed, 27 Feb 2019 10:33:42 +0000 (10:33 +0000)]
tools/ocaml: Dup2 /dev/null to stdin in daemonize()

Don't close stdin in daemonize() but dup2 /dev/null instead.  Otherwise, fd 0
gets reused later:

  [root@idol ~]# ls -lav /proc/`pgrep xenstored`/fd
  total 0
  dr-x------ 2 root root  0 Feb 28 11:02 .
  dr-xr-xr-x 9 root root  0 Feb 27 15:59 ..
  lrwx------ 1 root root 64 Feb 28 11:02 0 -> /dev/xen/evtchn
  l-wx------ 1 root root 64 Feb 28 11:02 1 -> /dev/null
  l-wx------ 1 root root 64 Feb 28 11:02 2 -> /dev/null
  lrwx------ 1 root root 64 Feb 28 11:02 3 -> /dev/xen/privcmd
  ...

Signed-off-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Release-acked-by: Juergen Gross <jgross@suse.com>
(cherry picked from commit 677e64dbe315343620c3b266e9eb16623b118038)

6 years agotools/misc/xenpm: fix getting info when some CPUs are offline
Marek Marczykowski-Górecki [Wed, 31 Oct 2018 13:04:58 +0000 (14:04 +0100)]
tools/misc/xenpm: fix getting info when some CPUs are offline

Use physinfo.max_cpu_id instead of physinfo.nr_cpus to get max CPU id.
This fixes for example 'xenpm get-cpufreq-para' with smt=off, which
otherwise would miss half of the cores.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit ffb60a58df48419c1f2607cd3cc919fa2bfc9c2d)

6 years agox86: fix build race when generating temporary object files
Jan Beulich [Wed, 15 May 2019 07:49:35 +0000 (09:49 +0200)]
x86: fix build race when generating temporary object files

The rules to generate xen-syms and xen.efi may run in parallel, but both
recursively invoke $(MAKE) to build symbol/relocation table temporary
object files. These recursive builds would both re-generate the .*.d2
files (where needed). Both would in turn invoke the same rule, thus
allowing for a race on the .*.d2.tmp intermediate files.

The dependency files of the temporary .xen*.o files live in xen/ rather
than xen/arch/x86/ anyway, so won't be included no matter what. Take the
opportunity and delete them, as the just re-generated .xen*.S files will
trigger a proper re-build of the .xen*.o ones anyway.

Empty the DEPS variable in case the set of goals consists of just those
temporary object files, thus eliminating the race.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 761bb575ce97255029d2d2249b2719e54bc76825
master date: 2019-04-11 10:25:05 +0200

6 years agoVT-d: posted interrupts require interrupt remapping
Jan Beulich [Wed, 15 May 2019 07:49:04 +0000 (09:49 +0200)]
VT-d: posted interrupts require interrupt remapping

Initially I had just noticed the unnecessary indirection in the call
from pi_update_irte(). The generic wrapper having an iommu_intremap
conditional made me look at the setup code though. So first of all
enforce the necessary dependency.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 6c54663786d9f1ed04153867687c158675e7277d
master date: 2019-04-09 15:12:07 +0200

6 years agovm_event: fix XEN_VM_EVENT_RESUME domctl
Petre Pircalabu [Wed, 15 May 2019 07:48:28 +0000 (09:48 +0200)]
vm_event: fix XEN_VM_EVENT_RESUME domctl

Make XEN_VM_EVENT_RESUME return 0 in case of success, instead of
-EINVAL.
Remove vm_event_resume form vm_event.h header and set the function's
visibility to static as is used only in vm_event.c.
Move the vm_event_check_ring test inside vm_event_resume in order to
simplify the code.

Signed-off-by: Petre Pircalabu <ppircalabu@bitdefender.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
master commit: b32c0446b103aa801ee18780b2fdd78dfc0b9052
master date: 2019-04-05 15:42:03 +0200

6 years agoxen/timers: Fix memory leak with cpu unplug/plug
Andrew Cooper [Wed, 15 May 2019 07:47:32 +0000 (09:47 +0200)]
xen/timers: Fix memory leak with cpu unplug/plug

timer_softirq_action() realloc's itself a larger timer heap whenever
necessary, which includes bootstrapping from the empty dummy_heap.  Nothing
ever freed this allocation.

CPU plug and unplug has the side effect of zeroing the percpu data area, which
clears ts->heap.  This in turn causes new timers to be put on the list rather
than the heap, and for timer_softirq_action() to bootstrap itself again.

This in practice leaks ts->heap every time a CPU is unplugged and replugged.

Implement free_percpu_timers() which includes freeing ts->heap when
appropriate, and update the notifier callback with the recent cpu parking
logic and free-avoidance across suspend.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
xen/cpu: Fix ARM build following c/s 597fbb8

c/s 597fbb8 "xen/timers: Fix memory leak with cpu unplug/plug" broke the ARM
build by being the first patch to add park_offline_cpus to common code.

While it is currently specific to Intel hardware (for reasons of being able to
handle machine check exceptions without an immediate system reset), it isn't
inherently architecture specific, so define it to be false on ARM for now.

Add a comment in both smp.h headers explaining the intended behaviour of the
option.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
timers: move back migrate_timers_from_cpu() invocation

Commit 597fbb8be6 ("xen/timers: Fix memory leak with cpu unplug/plug")
went a little too far: Migrating timers away from a CPU being offlined
needs to heppen independent of whether it get parked or fully offlined.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
xen/timers: Fix memory leak with cpu unplug/plug (take 2)

Previous attempts to fix this leak failed to identify the root cause, and
ultimately failed.  The cause is the CPU_UP_PREPARE case (re)initialising
ts->heap back to dummy_heap, which leaks the previous allocation.

Rearrange the logic to only initialise ts once.  This also avoids the
redundant (but benign, due to ts->inactive always being empty) initialising of
the other ts fields.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 597fbb8be6021440cd53493c14201c32671bade1
master date: 2019-04-08 11:16:06 +0100
master commit: a6448adfd3d537aacbbd784e5bf1777ab3ff5f85
master date: 2019-04-09 10:12:57 +0100
master commit: 1aec95350ac8261cba516371710d4d837c26f6a0
master date: 2019-04-15 17:51:30 +0100
master commit: e978e9ed9e1ff0dc326e72708ed03cac2ba41db8
master date: 2019-05-13 10:35:37 +0100

6 years agox86emul: suppress general register update upon AVX gather failures
Jan Beulich [Wed, 15 May 2019 07:46:41 +0000 (09:46 +0200)]
x86emul: suppress general register update upon AVX gather failures

While destination and mask registers may indeed need updating in this
case, the rIP update in particular needs to be avoided, as well as e.g.
raising a single step trap.

Reported-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 74f299bbd7d5cc52325b5866c17b44dd0bd1c5a2
master date: 2019-04-03 10:14:32 +0200

6 years agoxen/sched: fix credit2 smt idle handling
Juergen Gross [Wed, 15 May 2019 07:45:58 +0000 (09:45 +0200)]
xen/sched: fix credit2 smt idle handling

Credit2's smt_idle_mask_set() and smt_idle_mask_clear() are used to
identify idle cores where vcpus can be moved to. A core is thought to
be idle when all siblings are known to have the idle vcpu running on
them.

Unfortunately the information of a vcpu running on a cpu is per
runqueue. So in case not all siblings are in the same runqueue a core
will never be regarded to be idle, as the sibling not in the runqueue
is never known to run the idle vcpu.

Use a credit2 specific cpumask of siblings with only those cpus
being marked which are in the same runqueue as the cpu in question.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
master commit: 753ba43d6d16e688f688e01e1c77463ea2c6ec9f
master date: 2019-03-29 18:28:21 +0000

6 years agox86/spec-ctrl: Introduce options to control VERW flushing
Andrew Cooper [Wed, 12 Dec 2018 19:22:15 +0000 (19:22 +0000)]
x86/spec-ctrl: Introduce options to control VERW flushing

The Microarchitectural Data Sampling vulnerability is split into categories
with subtly different properties:

 MLPDS - Microarchitectural Load Port Data Sampling
 MSBDS - Microarchitectural Store Buffer Data Sampling
 MFBDS - Microarchitectural Fill Buffer Data Sampling
 MDSUM - Microarchitectural Data Sampling Uncacheable Memory

MDSUM is a special case of the other three, and isn't distinguished further.

These issues pertain to three microarchitectural buffers.  The Load Ports, the
Store Buffers and the Fill Buffers.  Each of these structures are flushed by
the new enhanced VERW functionality, but the conditions under which flushing
is necessary vary.

For this concise overview of the issues and default logic, the abbreviations
SP (Store Port), FB (Fill Buffer), LP (Load Port) and HT (Hyperthreading) are
used for brevity:

 * Vulnerable hardware is divided into two categories - parts which suffer
   from SP only, and parts with any other combination of vulnerabilities.

 * SP only has an HT interaction when the thread goes idle, due to the static
   partitioning of resources.  LP and FB have HT interactions at all points,
   due to the competitive sharing of resources.  All issues potentially leak
   data across the return-to-guest transition.

 * The microcode which implements VERW flushing also extends MSR_FLUSH_CMD, so
   we don't need to do both on the HVM return-to-guest path.  However, some
   parts are not vulnerable to L1TF (therefore have no MSR_FLUSH_CMD), but are
   vulnerable to MDS, so do require VERW on the HVM path.

Note that we deliberately support mds=1 even without MD_CLEAR in case the
microcode has been updated but the feature bit not exposed.

This is part of XSA-297, CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, CVE-2019-11091.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 3c04c258ab40405a74e194d9889a4cbc7abe94b4)

6 years agox86/spec-ctrl: Infrastructure to use VERW to flush pipeline buffers
Andrew Cooper [Wed, 12 Dec 2018 19:22:15 +0000 (19:22 +0000)]
x86/spec-ctrl: Infrastructure to use VERW to flush pipeline buffers

Three synthetic features are introduced, as we need individual control of
each, depending on circumstances.  A later change will enable them at
appropriate points.

The verw_sel field doesn't strictly need to live in struct cpu_info.  It lives
there because there is a convenient hole it can fill, and it reduces the
complexity of the SPEC_CTRL_EXIT_TO_{PV,HVM} assembly by avoiding the need for
any temporary stack maintenance.

This is part of XSA-297, CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, CVE-2019-11091.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 548a932ac786d6bf3584e4b54f2ab993e1117710)

6 years agox86/spec-ctrl: CPUID/MSR definitions for Microarchitectural Data Sampling
Andrew Cooper [Wed, 12 Sep 2018 13:36:00 +0000 (14:36 +0100)]
x86/spec-ctrl: CPUID/MSR definitions for Microarchitectural Data Sampling

The MD_CLEAR feature can be automatically offered to guests.  No
infrastructure is needed in Xen to support the guest making use of it.

This is part of XSA-297, CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, CVE-2019-11091.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit d4f6116c080dc013cd1204c4d8ceb95e5f278689)

6 years agox86/spec-ctrl: Misc non-functional cleanup
Andrew Cooper [Wed, 12 Sep 2018 13:36:00 +0000 (14:36 +0100)]
x86/spec-ctrl: Misc non-functional cleanup

 * Identify BTI in the spec_ctrl_{enter,exit}_idle() comments, as other
   mitigations will shortly appear.
 * Use alternative_input() and cover the lack of memory cobber with a further
   barrier.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 9b62eba6c429c327e1507816bef403ccc87357ae)

6 years agox86/boot: Detect the firmware SMT setting correctly on Intel hardware
Andrew Cooper [Fri, 5 Apr 2019 12:26:30 +0000 (13:26 +0100)]
x86/boot: Detect the firmware SMT setting correctly on Intel hardware

While boot_cpu_data.x86_num_siblings is an accurate value to use on AMD
hardware, it isn't on Intel when the user has disabled Hyperthreading in the
firmware.  As a result, a user which has chosen to disable HT still gets
nagged on L1TF-vulnerable hardware when they haven't chosen an explicit
smt=<bool> setting.

Make use of the largely-undocumented MSR_INTEL_CORE_THREAD_COUNT which in
practice exists since Nehalem, when booting on real hardware.  Fall back to
using the ACPI table APIC IDs.

While adjusting this logic, fix a latent bug in amd_get_topology().  The
thread count field in CPUID.0x8000001e.ebx is documented as 8 bits wide,
rather than 2 bits wide.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit b12fec4a125950240573ea32f65c61fb9afa74c3)

6 years agox86/msr: Definitions for MSR_INTEL_CORE_THREAD_COUNT
Andrew Cooper [Fri, 5 Apr 2019 12:26:30 +0000 (12:26 +0000)]
x86/msr: Definitions for MSR_INTEL_CORE_THREAD_COUNT

This is a model specific register which details the current configuration
cores and threads in the package.  Because of how Hyperthread and Core
configuration works works in firmware, the MSR it is de-facto constant and
will remain unchanged until the next system reset.

It is a read only MSR (so unilaterally reject writes), but for now retain its
leaky-on-read properties.  Further CPUID/MSR work is required before we can
start virtualising a consistent topology to the guest, and retaining the old
behaviour is the safest course of action.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit d4120936bcd1695faf5b575f1259c58e31d2b18b)

6 years agox86/spec-ctrl: Reposition the XPTI command line parsing logic
Andrew Cooper [Wed, 12 Sep 2018 13:36:00 +0000 (14:36 +0100)]
x86/spec-ctrl: Reposition the XPTI command line parsing logic

It has ended up in the middle of the mitigation calculation logic.  Move it to
be beside the other command line parsing.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit c2c2bb0d60c642e64a5243a79c8b1548ffb7bc5b)

6 years agoUpdate changelog for new upstream 4.11.1+58-g3b062f5040
Hans van Kranenburg [Mon, 13 May 2019 19:54:56 +0000 (21:54 +0200)]
Update changelog for new upstream 4.11.1+58-g3b062f5040

[git-debrebase changelog: new upstream 4.11.1+58-g3b062f5040]

6 years agoUpdate to upstream 4.11.1+58-g3b062f5040
Hans van Kranenburg [Mon, 13 May 2019 19:54:56 +0000 (21:54 +0200)]
Update to upstream 4.11.1+58-g3b062f5040

[git-debrebase anchor: new upstream 4.11.1+58-g3b062f5040, merge]

6 years agofinalise 4.11.1+26-g87f51bf366-3
Ian Jackson [Thu, 28 Feb 2019 16:38:37 +0000 (16:38 +0000)]
finalise 4.11.1+26-g87f51bf366-3

6 years agox86/spec-ctrl: Extend repoline safey calcuations for eIBRS and Atom parts
Andrew Cooper [Fri, 3 May 2019 08:55:55 +0000 (10:55 +0200)]
x86/spec-ctrl: Extend repoline safey calcuations for eIBRS and Atom parts

All currently-released Atom processors are in practice retpoline-safe, because
they don't fall back to a BTB prediction on RSB underflow.

However, an additional meaning of Enhanced IRBS is that the processor may not
be retpoline-safe.  The Gemini Lake platform, based on the Goldmont Plus
microarchitecture is the first Atom processor to support eIBRS.

Until Xen gets full eIBRS support, Gemini Lake will still be safe using
regular IBRS.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
master commit: 17f74242ccf0ce6e51c03a5860947865c0ef0dc2
master date: 2019-03-18 16:26:40 +0000

6 years agox86/msr: Shorten ARCH_CAPABILITIES_* constants
Andrew Cooper [Fri, 3 May 2019 08:55:10 +0000 (10:55 +0200)]
x86/msr: Shorten ARCH_CAPABILITIES_* constants

They are unnecesserily verbose, and ARCH_CAPS_* is already the more common
version.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
master commit: ba27aaa88548c824a47dcf5609288ee1c05d2946
master date: 2019-03-18 16:26:40 +0000

6 years agox86/e820: fix build with gcc9
Jan Beulich [Fri, 3 May 2019 08:53:40 +0000 (10:53 +0200)]
x86/e820: fix build with gcc9

e820.c: In function ‘clip_to_limit’:
.../xen/include/asm/string.h:10:26: error: ‘__builtin_memmove’ offset [-16, -36] is out of the bounds [0, 20484] of object ‘e820’ with type ‘struct e820map’ [-Werror=array-bounds]
   10 | #define memmove(d, s, n) __builtin_memmove(d, s, n)
      |                          ^~~~~~~~~~~~~~~~~~~~~~~~~~
e820.c:404:13: note: in expansion of macro ‘memmove’
  404 |             memmove(&e820.map[i], &e820.map[i+1],
      |             ^~~~~~~
e820.c:36:16: note: ‘e820’ declared here
   36 | struct e820map e820;
      |                ^~~~

While I can't see where the negative offsets would come from, converting
the loop index to unsigned type helps. Take the opportunity and also
convert several other local variables and copy_e820_map()'s second
parameter to unsigned int (and bool in one case).

Reported-by: Charles Arnold <carnold@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 22e2f8dddf5fbed885b5e4db3ffc9e1101be9ec0
master date: 2019-03-18 11:38:36 +0100

6 years agoxen: Fix backport of "xen/cmdline: Fix buggy strncmp(s, LITERAL, ss - s) construct"
Andrew Cooper [Fri, 3 May 2019 08:52:32 +0000 (10:52 +0200)]
xen: Fix backport of "xen/cmdline: Fix buggy strncmp(s, LITERAL, ss - s) construct"

These were missed as a consequence of being rebased over other cmdline
cleanup.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agoxen: Fix backport of "x86/tsx: Implement controls for RTM force-abort mode"
Andrew Cooper [Fri, 3 May 2019 08:51:31 +0000 (10:51 +0200)]
xen: Fix backport of "x86/tsx: Implement controls for RTM force-abort mode"

The posted version of this patch depends on c/s 3c555295 "x86/vpmu: Improve
documentation and parsing for vpmu=" (Xen 4.12 and later) to prevent
`vpmu=rtm-abort` impliying `vpmu=1`, which is outside of security support.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agotools/firmware: update OVMF Makefile, when necessary
Wei Liu [Wed, 28 Nov 2018 17:43:33 +0000 (17:43 +0000)]
tools/firmware: update OVMF Makefile, when necessary

[ This is two commits from master aka staging-4.12: ]

OVMF has become dependent on OpenSSL, which is included as a
submodule.  Initialise submodules before building.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit b16281870e06f5f526029a4e69634a16dc38e8e4)

tools: only call git when necessary in OVMF Makefile

Users may choose to export a snapshot of OVMF and build it
with xen.git supplied ovmf-makefile. In that case we don't
need to call `git submodule`.

Fixes b16281870e.

Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Release-acked-by: Juergen Gross <jgross@suse.com>
(cherry picked from commit 68292c94a60eab24514ab4a8e4772af24dead807)

6 years agoArm/atomic: correct asm() constraints in build_add_sized()
Jan Beulich [Tue, 12 Mar 2019 13:42:17 +0000 (14:42 +0100)]
Arm/atomic: correct asm() constraints in build_add_sized()

The memory operand is an in/out one, and the auxiliary register gets
written to early.

Take the opportunity and also drop the redundant cast (the inline
functions' parameters are already of the casted-to type).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
(cherry picked from commit 51ceb1623b9956440f1b9943c67010a90d61f5c5)

7 years agox86/pv: Fix construction of 32bit dom0's
Andrew Cooper [Mon, 18 Mar 2019 16:09:08 +0000 (17:09 +0100)]
x86/pv: Fix construction of 32bit dom0's

dom0_construct_pv() has logic to transition dom0 into a compat domain when
booting an ELF32 image.

One aspect which is missing is the CPUID policy recalculation, meaning that a
32bit dom0 sees a 64bit policy, which differ by the Long Mode feature flag in
particular.  Another missing item is the x87_fip_width initialisation.

Update dom0_construct_pv() to use switch_compat(), rather than retaining the
opencoding.  Position the call to switch_compat() such that the compat32 local
variable can disappear entirely.

The 32bit monitor table is now created by setup_compat_l4(), avoiding the need
to for manual creation later.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 356f437171c5bb90701ac9dd7ba4dbbd05988e38
master date: 2019-03-15 14:59:27 +0000

7 years agox86/tsx: Implement controls for RTM force-abort mode
Andrew Cooper [Mon, 18 Mar 2019 16:08:25 +0000 (17:08 +0100)]
x86/tsx: Implement controls for RTM force-abort mode

The CPUID bit and MSR are deliberately not exposed to guests, because they
won't exist on newer processors.  As vPMU isn't security supported, the
misbehaviour of PCR3 isn't expected to impact production deployments.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 6be613f29b4205349275d24367bd4c82fb2960dd
master date: 2019-03-12 17:05:21 +0000

7 years agox86/vtd: Don't include control register state in the table pointers
Andrew Cooper [Mon, 18 Mar 2019 16:07:45 +0000 (17:07 +0100)]
x86/vtd: Don't include control register state in the table pointers

iremap_maddr and qinval_maddr point to the base of a block of contiguous RAM,
allocated by the driver, holding the Interrupt Remapping table, and the Queued
Invalidation ring.

Despite their name, they are actually the values of the hardware register,
including control metadata in the lower 12 bits.  While uses of these fields
do appear to correctly shift out the metadata, this is very subtle behaviour
and confusing to follow.

Nothing uses the metadata, so make the fields actually point at the base of
the relevant tables.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
master commit: a9a05aeee10a5a3763a41305a9f38112dd1fcc82
master date: 2019-03-12 13:57:13 +0000

7 years agox86/HVM: don't crash guest in hvmemul_find_mmio_cache()
Jan Beulich [Mon, 18 Mar 2019 16:07:11 +0000 (17:07 +0100)]
x86/HVM: don't crash guest in hvmemul_find_mmio_cache()

Commit 35a61c05ea ("x86emul: adjust handling of AVX2 gathers") builds
upon the fact that the domain will actually survive running out of MMIO
result buffer space. Drop the domain_crash() invocation. Also delay
incrementing of the usage counter, such that the function can't possibly
use/return an out-of-bounds slot/pointer in case execution subsequently
makes it into the function again without a prior reset of state.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
master commit: a43c1dec246bdee484e6a3de001cc6850a107abe
master date: 2019-03-12 14:39:46 +0100

7 years agoiommu: leave IOMMU enabled by default during kexec crash transition
Igor Druzhinin [Mon, 18 Mar 2019 16:06:37 +0000 (17:06 +0100)]
iommu: leave IOMMU enabled by default during kexec crash transition

It's unsafe to disable IOMMU on a live system which is the case
if we're crashing since remapping hardware doesn't usually know what
to do with ongoing bus transactions and frequently raises NMI/MCE/SMI,
etc. (depends on the firmware configuration) to signal these abnormalities.
This, in turn, doesn't play well with kexec transition process as there is
no handling available at the moment for this kind of events resulting
in failures to enter the kernel.

Modern Linux kernels taught to copy all the necessary DMAR/IR tables
following kexec from the previous kernel (Xen in our case) - so it's
currently normal to keep IOMMU enabled. It might require minor changes to
kdump command line that enables IOMMU drivers (e.g. intel_iommu=on /
intremap=on) but recent kernels don't require any additional changes for
the transition to be transparent.

A fallback option is still left for compatibility with ancient crash
kernels which didn't like to have IOMMU active under their feet on boot.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
master commit: 12c36f577d454996c882ecdc5da8113ca2613646
master date: 2019-03-12 14:38:12 +0100

7 years agox86/cpuid: add missing PCLMULQDQ dependency
Jan Beulich [Mon, 18 Mar 2019 16:05:45 +0000 (17:05 +0100)]
x86/cpuid: add missing PCLMULQDQ dependency

Since we can't seem to be able to settle our discussion for the wider
adjustment previously posted, let's at least add the missing dependency
for 4.12. I'm not convinced though that attaching it to SSE is correct.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: eeb31ee522c7bb8541eb4c037be2c42bfcf0a3c3
master date: 2019-03-05 18:04:23 +0100

7 years agox86/mm: fix #GP(0) in switch_cr3_cr4()
Jan Beulich [Mon, 18 Mar 2019 16:05:07 +0000 (17:05 +0100)]
x86/mm: fix #GP(0) in switch_cr3_cr4()

With "pcid=no-xpti" and opposite XPTI settings in two 64-bit PV domains
(achievable with one of "xpti=no-dom0" or "xpti=no-domu"), switching
from a PCID-disabled to a PCID-enabled 64-bit PV domain fails to set
CR4.PCIDE in time, as CR4.PGE would not be set in either (see
pv_fixup_guest_cr4(), in particular as used by write_ptbase()), and
hence the early CR4 write would be skipped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: fdc2056767ba74346dfd8bbe868bb22521ba1418
master date: 2019-03-05 17:02:36 +0100

7 years agox86/nmi: correctly check MSB of P6 performance counter MSR in watchdog
Igor Druzhinin [Mon, 18 Mar 2019 16:04:30 +0000 (17:04 +0100)]
x86/nmi: correctly check MSB of P6 performance counter MSR in watchdog

The logic currently tries to work out if a recent overflow (that indicates
that NMI comes from the watchdog) happened by checking MSB of performance
counter MSR that is initially sign extended from a negative value
that we program it to. A possibly incorrect assumption here is that
MSB is always bit 32 while on modern hardware it's usually 47 and
the actual bit-width is reported through CPUID. Checking bit 32 for
overflows is usually fine since we never program it to anything
exceeding 32-bits and NMI is handled shortly after overflow occurs.

A problematic scenario that we saw occurs on systems where SMIs taking
significant time are possible. In that case, NMI handling is deferred to
the point firmware exits SMI which might take enough time for the counter
to go through bit 32 and set it to 1 again. So the logic described above
will misread it and report an unknown NMI erroneously.

Fortunately, we can use the actual MSB, which is usually higher than the
currently hardcoded 32, and treat this case correctly at least on modern
hardware.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 0452d02b6e7849537914dd30cbfc8eb27cdad2ce
master date: 2019-02-28 13:44:40 +0000

7 years agox86/hvm: Increase the triple fault log message level to XENLOG_ERR
Andrew Cooper [Mon, 18 Mar 2019 16:03:41 +0000 (17:03 +0100)]
x86/hvm: Increase the triple fault log message level to XENLOG_ERR

At INFO level, it doesn't get printed out by default in release builds,
leading to unqualified logging such as this:

  (XEN) [   66.995993] Freed 524kB init memory
  (XEN) [ 1993.144997] *** Dumping Dom9 vcpu#2 state: ***
  (XEN) [ 1993.145008] ----[ Xen-4.11.1  x86_64  debug=n   Not tainted ]----
  (XEN) [ 1993.145011] CPU:    21
  (XEN) [ 1993.145015] RIP:    0010:[<ffffe0002ba950ef>]
  (XEN) [ 1993.145018] RFLAGS: 0000000000010246   CONTEXT: hvm guest (d9v2)
  (XEN) [ 1993.145026] rax: 00000000ffffe000   rbx: ffffe0002d8e1440   rcx: 0000ffffe0002ba9
  (XEN) [ 1993.145031] rdx: 0000000000000000   rsi: ffffe0002ba93575   rdi: fffff803dfb9f340
  (XEN) [ 1993.145035] rbp: ffffd001cd791200   rsp: ffffd001cd791140   r8:  0000000000000130
  (XEN) [ 1993.145039] r9:  0000000080000000   r10: 0000000000000000   r11: 0000000000000020
  (XEN) [ 1993.145043] r12: ffffe0002ba9306d   r13: 0000000000000000   r14: 0000000000000001
  (XEN) [ 1993.145047] r15: fffff803dfb9f200   cr0: 0000000080050031   cr4: 0000000000170678
  (XEN) [ 1993.145051] cr3: 00000000001aa002   cr2: 0000020488403f70
  (XEN) [ 1993.145056] fsb: 0000000060f71000   gsb: ffffd001cc1af000   gss: 0000009d60f6f000
  (XEN) [ 1993.145060] ds: 002b   es: 002b   fs: 0053   gs: 002b   ss: 0018   cs: 0010

A triple fault is fatal to the domain under all circumstances (so will print
at most once), and in practice is always an error condition rather than a
reboot fallback.

Reported-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
master commit: 1c8ca185e3c6e003398471edd9dbac0cd118137c
master date: 2019-02-28 11:16:27 +0000

7 years agox86/vmx: Properly flush the TLB when an altp2m is modified
Andrew Cooper [Mon, 18 Mar 2019 16:02:49 +0000 (17:02 +0100)]
x86/vmx: Properly flush the TLB when an altp2m is modified

Modifications to an altp2m mark the p2m as needing flushing, but this was
never wired up in the return-to-guest path.  As a result, stale TLB entries
can remain after resuming the guest.

In practice, this manifests as a missing EPT_VIOLATION or #VE exception when
the guest subsequently accesses a page which has had its permissions reduced.

vmx_vmenter_helper() now has 11 p2ms to potentially invalidate, but issuing 11
INVEPT instructions isn't clever.  Instead, count how many contexts need
invalidating, and use INVEPT_ALL_CONTEXT if two or more are in need of
flushing.

This doesn't have an XSA because altp2m is not yet a security-supported
feature.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
master commit: 69f7643df68ef8e994221a996e336a47cbb7bbc8
master date: 2019-02-28 11:16:27 +0000

7 years agox86/shadow: don't use map_domain_page_global() on paths that may not fail
Jan Beulich [Mon, 18 Mar 2019 16:02:02 +0000 (17:02 +0100)]
x86/shadow: don't use map_domain_page_global() on paths that may not fail

The assumption (according to one comment) and hope (according to
another) that map_domain_page_global() can't fail are both wrong on
large enough systems. Do away with the guest_vtable field altogether,
and establish / tear down the desired mapping as necessary.

The alternatives, discarded as being undesirable, would have been to
either crash the guest in sh_update_cr3() when the mapping fails, or to
bubble up an error indicator, which upper layers would have a hard time
to deal with (other than again by crashing the guest).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
master commit: 43282a5e64da26fad544e0100abf35048cf65b46
master date: 2019-02-26 16:56:26 +0100

7 years agoviridian: fix the HvFlushVirtualAddress/List hypercall implementation
Paul Durrant [Mon, 18 Mar 2019 16:01:21 +0000 (17:01 +0100)]
viridian: fix the HvFlushVirtualAddress/List hypercall implementation

The current code uses hvm_asid_flush_vcpu() but this is insufficient for
a guest running in shadow mode, which results in guest crashes early in
boot if the 'hcall_remote_tlb_flush' is enabled.

This patch, instead of open coding a new flush algorithm, adapts the one
already used by the HVMOP_flush_tlbs Xen hypercall. The implementation is
modified to allow TLB flushing a subset of a domain's vCPUs. A callback
function determines whether or not a vCPU requires flushing. This mechanism
was chosen because, while it is the case that the currently implemented
viridian hypercalls specify a vCPU mask, there are newer variants which
specify a sparse HV_VP_SET and thus use of a callback will avoid needing to
expose details of this outside of the viridian subsystem if and when those
newer variants are implemented.

NOTE: Use of the common flush function requires that the hypercalls are
      restartable and so, with this patch applied, viridian_hypercall()
      can now return HVM_HCALL_preempted. This is safe as no modification
      to struct cpu_user_regs is done before the return.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: ce98ee3050a824994ce4957faa8f53ecb8c7da9d
master date: 2019-02-26 16:55:06 +0100

7 years agox86/shadow: don't pass wrong L4 MFN to guest_walk_tables()
Jan Beulich [Mon, 18 Mar 2019 16:00:28 +0000 (17:00 +0100)]
x86/shadow: don't pass wrong L4 MFN to guest_walk_tables()

64-bit PV guest user mode runs on a different L4 table. Make sure
- the accessed bit gets set in the correct table (and in log-dirty
  mode the correct page gets marked dirty) during guest walks,
- the correct table gets audited by sh_audit_gw(),
- correct info gets logged by print_gw().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
master commit: db2af23d15077605f286d8ef86c8f5d9c1b8302a
master date: 2019-02-20 17:07:17 +0100

7 years agox86/pmtimer: fix hvm_acpi_sleep_button behavior
Varad Gautam [Mon, 18 Mar 2019 15:59:43 +0000 (16:59 +0100)]
x86/pmtimer: fix hvm_acpi_sleep_button behavior

Commit 19fb14622e941 "x86/pmtimer: move ACPI registers from PMTState to
hvm_domain" misconfigures pm1a_sts for hvm_acpi_sleep_button with
PWRBTN_STS instead of SLPBTN_STS, which leads to
XEN_DOMCTL_SENDTRIGGER_SLEEP causing guest powerdowns. Fix this.

Signed-off-by: Varad Gautam <vrd@amazon.de>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: b22c900c44a2db8db1c53e269e152206e55c273f
master date: 2019-02-20 17:06:25 +0100